AITopics

Industry: Law > Litigation (0.66)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.96)

Neural Information Processing SystemsDec-23-2025, 17:51:11 GMT

Predictive Coding beyond Gaussian Distributions

A large amount of recent research has the far-reaching goal of finding training methods for deep neural networks that can serve as alternatives to backpropagation~(BP). A prominent example is predictive coding (PC), which is a neuroscience-inspired method that performs inference on hierarchical Gaussian generative models. These methods, however, fail to keep up with modern neural networks, as they are unable to replicate the dynamics of complex layers and activation functions. In this work, we solve this problem by generalizing PC to arbitrary probability distributions, enabling the training of architectures, such as transformers, that are hard to approximate with only Gaussian assumptions. We perform three experimental analyses. First, we study the gap between our method and the standard formulation of PC on multiple toy examples. Second, we test the reconstruction quality on variational autoencoders, where our method reaches the same reconstruction quality as BP. Third, we show that our method allows us to train transformer networks and achieve performance comparable with BP on conditional language models. More broadly, this method allows neuroscience-inspired learning to be applied to multiple domains, since the internal distributions can be flexibly adapted to the data, tasks, and architectures used.

gaussian distribution, name change, predictive coding, (5 more...)

Industry: Law > Litigation (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Goemaere, Cédric, Oliviers, Gaspard, Bogacz, Rafal, Demeester, Thomas

ePC: Overcoming Exponential Signal Decay in Deep Predictive Coding Networks

arXiv.org Artificial IntelligenceSep-30-2025

Predictive Coding (PC) offers a biologically plausible alternative to backpropagation for neural network training, yet struggles with deeper architectures. This paper identifies the root cause and provides a principled solution. We uncover that the canonical state-based formulation of PC (sPC) is, by design, deeply inefficient on digital hardware, due to an inherent signal decay problem that scales exponentially with depth. To address this fundamental limitation, we introduce a novel reparameterization of PC, named error-based PC (ePC), which does not suffer from signal decay. By optimizing over prediction errors rather than states, ePC enables signals to reach all layers simultaneously and unattenuated, converging orders of magnitude faster than sPC. Experiments across multiple architectures and datasets demonstrate that ePC matches backpropagation's performance even for deeper models where sPC struggles. Besides practical improvements, our work provides theoretical insight into PC dynamics and establishes a foundation for scaling bio-inspired learning to deeper architectures on digital hardware and beyond.

artificial intelligence, epc, machine learning, (18 more...)

2505.20137

Country: Europe (0.45)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine (0.93)
Law > Litigation (0.62)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Schoinas, Thanasis, Qadir, Ghulam

A Feasibility Experiment on the Application of Predictive Coding to Instant Messaging Corpora

arXiv.org Artificial IntelligenceAug-18-2025

Predictive coding, the term used in the legal industry for document classification using machine learning, presents additional challenges when the dataset comprises instant messages, due to their informal nature and smaller sizes. In this paper, we exploit a data management workflow to group messages into day chats, followed by feature selection and a logistic regression classifier to provide an economically feasible predictive coding solution. We also improve the solution's baseline model performance by dimensionality reduction, with focus on quantitative features. We test our methodology on an Instant Bloomberg dataset, rich in quantitative information. In parallel, we provide an example of the cost savings of our approach.

artificial intelligence, instant messaging corpora, machine learning, (3 more...)

2508.11084

Genre: Research Report (0.89)

Industry: Law > Litigation (0.80)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.87)

Neural Information Processing SystemsJan-19-2025, 07:42:13 GMT

Learning on Arbitrary Graph Topologies via Predictive Coding

Training with backpropagation (BP) in standard deep learning consists of two main steps: a forward pass that maps a data point to its prediction, and a backward pass that propagates the error of this prediction back through the network. This process is highly effective when the goal is to minimize a specific objective function. However, it does not allow training on networks with cyclic or backward connections. This is an obstacle to reaching brain-like capabilities, as the highly complex heterarchical structure of the neural connections in the neocortex are potentially fundamental for its effectiveness. In this paper, we show how predictive coding (PC), a theory of information processing in the cortex, can be used to perform inference and learning on arbitrary graph topologies.

arbitrary graph topology, learning, predictive coding

Industry: Law > Litigation (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.62)

arXiv.org Artificial IntelligenceDec-21-2024

ActPC-Chem: Discrete Active Predictive Coding for Goal-Guided Algorithmic Chemistry as a Potential Cognitive Kernel for Hyperon & PRIMUS-Based AGI

Goertzel, Ben

We explore a novel paradigm (labeled ActPC-Chem) for biologically inspired, goal-guided artificial intelligence (AI) centered on a form of Discrete Active Predictive Coding (ActPC) operating within an algorithmic chemistry of rewrite rules. ActPC-Chem is envisioned as a foundational "cognitive kernel" for advanced cognitive architectures, such as the OpenCog Hyperon system, incorporating essential elements of the PRIMUS cognitive architecture. The central thesis is that general-intelligence-capable cognitive structures and dynamics can emerge in a system where both data and models are represented as evolving patterns of metagraph rewrite rules, and where prediction errors, intrinsic and extrinsic rewards, and semantic constraints guide the continual reorganization and refinement of these rules. Using a virtual "robot bug" thought experiment, we illustrate how such a system might self-organize to handle challenging tasks involving delayed and context-dependent rewards, integrating causal rule inference (AIRIS) and probabilistic logical abstraction (PLN) to discover and exploit conceptual patterns and causal constraints. Next, we describe how continuous predictive coding neural networks, which excel at handling noisy sensory data and motor control signals, can be coherently merged with the discrete ActPC substrate. Finally, we outline how these ideas might be extended to create a transformer-like architecture that foregoes traditional backpropagation in favor of rule-based transformations guided by ActPC. This layered architecture, supplemented with AIRIS and PLN, promises structured, multi-modal, and logically consistent next-token predictions and narrative sequences.

artificial intelligence, machine learning, write rule, (17 more...)

2412.16547

Country:

North America > United States > New Mexico > Los Alamos County > Los Alamos (0.04)
North America > United States > California > Santa Clara County > Mountain View (0.04)
Europe > Iceland > Capital Region > Reykjavik (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.81)

Industry:

Law > Litigation (0.82)
Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)

Neural Information Processing SystemsOct-9-2024, 17:44:07 GMT

Associative Memories via Predictive Coding

associative memory, predictive coding, sensory neuron

Industry: Law > Litigation (0.64)

Technology:

Information Technology > Artificial Intelligence > Systems & Languages > Programming Languages (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)

Neural Information Processing SystemsOct-9-2024, 12:48:47 GMT

Predictive Coding beyond Gaussian Distributions

A large amount of recent research has the far-reaching goal of finding training methods for deep neural networks that can serve as alternatives to backpropagation (BP). A prominent example is predictive coding (PC), which is a neuroscience-inspired method that performs inference on hierarchical Gaussian generative models. These methods, however, fail to keep up with modern neural networks, as they are unable to replicate the dynamics of complex layers and activation functions. In this work, we solve this problem by generalizing PC to arbitrary probability distributions, enabling the training of architectures, such as transformers, that are hard to approximate with only Gaussian assumptions. We perform three experimental analyses.

gaussian distribution, predictive coding, reconstruction quality, (2 more...)

Industry: Law > Litigation (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Luu, Tung M., Lee, Donghoon, Yoo, Chang D.

Predictive Coding for Decision Transformer

arXiv.org Artificial IntelligenceOct-4-2024

Recent work in offline reinforcement learning (RL) has demonstrated the effectiveness of formulating decision-making as return-conditioned supervised learning. Notably, the decision transformer (DT) architecture has shown promise across various domains. However, despite its initial success, DTs have underperformed on several challenging datasets in goal-conditioned RL. This limitation stems from the inefficiency of return conditioning for guiding policy learning, particularly in unstructured and suboptimal datasets, resulting in DTs failing to effectively learn temporal compositionality. Moreover, this problem might be further exacerbated in long-horizon sparse-reward tasks. To address this challenge, we propose the Predictive Coding for Decision Transformer (PCDT) framework, which leverages generalized future conditioning to enhance DT methods. PCDT utilizes an architecture that extends the DT framework, conditioned on predictive codings, enabling decision-making based on both past and future factors, thereby improving generalization. Through extensive experiments on eight datasets from the AntMaze and FrankaKitchen environments, our proposed method achieves performance on par with or surpassing existing popular value-based and transformer-based methods in offline goal-conditioned RL. Furthermore, we also evaluate our method on a goal-reaching task with a physical robot.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

2410.03408

Country:

Asia > South Korea > Seoul > Seoul (0.04)
Asia > South Korea > Daejeon > Daejeon (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Law > Litigation (0.84)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

arXiv.org Artificial IntelligenceAug-31-2024

Contrastive Augmentation: An Unsupervised Learning Approach for Keyword Spotting in Speech Technology

Dai, Weinan, Jiang, Yifeng, Liu, Yuanjing, Chen, Jinkun, Sun, Xin, Tao, Jinglei

This paper addresses the persistent challenge in Keyword Spotting (KWS), a fundamental component in speech technology, regarding the acquisition of substantial labeled data for training. Given the difficulty in obtaining large quantities of positive samples and the laborious process of collecting new target samples when the keyword changes, we introduce a novel approach combining unsupervised contrastive learning and a unique augmentation-based technique. Our method allows the neural network to train on unlabeled data sets, potentially improving performance in downstream tasks with limited labeled data sets. We also propose that similar high-level feature representations should be employed for speech utterances with the same keyword despite variations in speed or volume. To achieve this, we present a speech augmentation-based unsupervised learning method that utilizes the similarity between the bottleneck layer feature and the audio reconstructing information for auxiliary training. Furthermore, we propose a compressed convolutional architecture to address potential redundancy and non-informative information in KWS tasks, enabling the model to simultaneously learn local features and focus on long-term information. This method achieves strong performance on the Google Speech Commands V2 Dataset. Inspired by recent advancements in sign spotting and spoken term detection, our method underlines the potential of our contrastive learning approach in KWS and the advantages of Query-by-Example Spoken Term Detection strategies. The presented CAB-KWS provide new perspectives in the field of KWS, demonstrating effective ways to reduce data collection efforts and increase the system's robustness.

classification accuracy, keyword, learning, (13 more...)

2409.00356

Country:

North America > United States > Georgia > Fulton County > Atlanta (0.04)
North America > United States > Texas > Brazos County > College Station (0.04)
North America > Canada > Nova Scotia > Halifax Regional Municipality > Halifax (0.04)
(3 more...)

Genre:

Research Report > Promising Solution (0.48)
Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)